Abstract

Earlier papers have shown that stress is differentiable at local minima if certain 
conditions on the weights and dissimilarities are satisfied. In this note we show the 
result remains true without these additional conditions. 
Note: This is a working paper which will be expanded/updated frequently. All suggestions
for improvement are welcome. The directory deleeuwpdx.net/pubfolders/zero has a pdf
version, the bib file, and the complete Rmd file.


1 Introduction 

The multidimensional scaling stress loss function is defined as

$$\sigma(X) = \sum\sum_{1\leq i<j\leq n} w_{ij}(\delta_{ij}-d_{ij}(X))^2,$$

where $W$ and $\Delta$ are non-negative, symmetric, and hollow matrices of weights and dissimilarities,
where $X$ is the $n\times p$ configuration, and where $d_{ij}(X)$ is the Euclidean distance between rows
$i$ and $j$ of $X$. Thus


$$d_{ij}^2(X) = (e_i-e_j)'XX'(e_i-e_j) = \mathrm{tr}\ X'A_{ij}X,$$

where $e_i$ and $e_j$ are unit vectors (columns of the identity matrix), and

$$A_{ij} = (e_i-e_j)(e_i-e_j)'.$$

Multidimensional scaling is minimization of stress over configurations. 
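
As a small illustration of the definition, the following sketch evaluates stress for a given configuration using only the Python standard library. The function names `distance` and `stress`, and the three-point data set, are hypothetical choices made for this example; they do not come from the paper.

```python
import math

def distance(X, i, j):
    """Euclidean distance d_ij(X) between rows i and j of the configuration X."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(X[i], X[j])))

def stress(X, W, Delta):
    """Stress: sum over pairs i < j of w_ij * (delta_ij - d_ij(X))^2."""
    n = len(X)
    return sum(
        W[i][j] * (Delta[i][j] - distance(X, i, j)) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )

# Hypothetical data: three points on a line (p = 1) with unit weights.
X = [[0.0], [1.0], [3.0]]
W = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
Delta = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
value = stress(X, W, Delta)
print(value)
```

In this example only the pair of rows 0 and 2 is fitted imperfectly (distance 3 against dissimilarity 2), so the stress equals 1.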
2 Differentiation

The directional derivative of $\sigma$ at $X$ in direction $Y$ is defined as

$$D\sigma(X,Y) = \lim_{\epsilon\downarrow 0}\frac{\sigma(X+\epsilon Y)-\sigma(X)}{\epsilon}.$$


Stress is not differentiable at configurations for which some $d_{ij}(X)$ are zero, but it has finite
directional derivatives everywhere. We will show this by actually giving the formula, which
was first given by De Leeuw (1984). See also De Leeuw, Groenen, and Mair (2016). The
directional derivative is interesting because clearly a necessary condition for $\sigma$ to have a local
minimum at $X$ is that $D\sigma(X,Y)\geq 0$ for all $Y$.
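
The smallest case already shows what goes wrong at zero distances. The sketch below takes two points on a line with weight and dissimilarity both equal to one, places both points at the origin, and computes the one-sided difference quotient in a direction and in its opposite. The function names and the step size `eps` are assumptions made for this illustration.

```python
def stress_two_points(x1, x2, w=1.0, delta=1.0):
    """Stress for n = 2 points on a line: w * (delta - |x1 - x2|)^2."""
    return w * (delta - abs(x1 - x2)) ** 2

def one_sided_quotient(y1, y2, eps=1e-6):
    """(sigma(X + eps*Y) - sigma(X)) / eps at the configuration X = (0, 0)."""
    base = stress_two_points(0.0, 0.0)
    return (stress_two_points(eps * y1, eps * y2) - base) / eps

forward = one_sided_quotient(1.0, 0.0)    # direction Y
backward = one_sided_quotient(-1.0, 0.0)  # direction -Y
print(forward, backward)
```

Both quotients are close to -2. For a differentiable function the two directional derivatives would be each other's negatives, so stress is not differentiable at this configuration, and because both values are negative the configuration cannot be a local minimum.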

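In order to derive a convenient expression for $D\sigma(X,Y)$ we give some definitions. First some
indicators for zero distances.

$$\alpha_{ij}(X) = \begin{cases}1 & \text{if } d_{ij}(X)=0,\\ 0 & \text{if } d_{ij}(X)>0,\end{cases}$$

$$\beta_{ij}(X) = \begin{cases}0 & \text{if } d_{ij}(X)=0,\\ d_{ij}^{-1}(X) & \text{if } d_{ij}(X)>0.\end{cases}$$

Then some matrices.

$$V = \sum\sum_{1\leq i<j\leq n} w_{ij}A_{ij},$$

$$B(X) = \sum\sum_{1\leq i<j\leq n} w_{ij}\delta_{ij}\beta_{ij}(X)A_{ij}.$$

And finally

$$\theta(X,Y) = \sum\sum_{1\leq i<j\leq n} w_{ij}\delta_{ij}\alpha_{ij}(X)d_{ij}(Y).$$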


Theorem 1: Da(X, Y) = tr Y'(V - B{X))X - 9(X, Y) 
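Proof: We have

$$d_{ij}(X+\epsilon Y) = \begin{cases}\epsilon\, d_{ij}(Y) & \text{if } d_{ij}(X)=0,\\ d_{ij}(X)+\epsilon\,\frac{1}{d_{ij}(X)}\,\mathrm{tr}\ Y'A_{ij}X + o(\epsilon) & \text{if } d_{ij}(X)>0.\end{cases}$$

The rest is simple computation. QED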
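
The formula can be checked numerically against the definition of the directional derivative. The sketch below is one way to do this under stated assumptions: it uses the identity $\mathrm{tr}\ Y'A_{ij}X = (x_i-x_j)'(y_i-y_j)$ to avoid forming $V$ and $B(X)$ explicitly, and, because stress is defined here without a factor $\frac12$, it compares the difference quotient with $2\{\mathrm{tr}\ Y'(V-B(X))X-\theta(X,Y)\}$. All function names, the step size `eps`, and the example data are hypothetical.

```python
import math

def diff(X, i, j):
    """Row i of X minus row j of X."""
    return [a - b for a, b in zip(X[i], X[j])]

def dist(X, i, j):
    """Euclidean distance d_ij(X)."""
    return math.sqrt(sum(t * t for t in diff(X, i, j)))

def stress(X, W, Delta):
    """Sum over pairs i < j of w_ij * (delta_ij - d_ij(X))^2."""
    n = len(X)
    return sum(W[i][j] * (Delta[i][j] - dist(X, i, j)) ** 2
               for i in range(n) for j in range(i + 1, n))

def formula(X, Y, W, Delta):
    """2 * (tr Y'(V - B(X))X - theta(X, Y)), accumulated pair by pair."""
    n = len(X)
    trace_part, theta = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = dist(X, i, j)
            if d == 0.0:
                # alpha_ij(X) = 1 and beta_ij(X) = 0; tr Y'A_ij X vanishes here.
                theta += W[i][j] * Delta[i][j] * dist(Y, i, j)
            else:
                # tr Y'A_ij X is the inner product of the two row differences.
                inner = sum(a * b for a, b in zip(diff(X, i, j), diff(Y, i, j)))
                trace_part += W[i][j] * (1.0 - Delta[i][j] / d) * inner
    return 2.0 * (trace_part - theta)

def quotient(X, Y, W, Delta, eps=1e-6):
    """One-sided difference quotient (sigma(X + eps*Y) - sigma(X)) / eps."""
    shifted = [[x + eps * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    return (stress(shifted, W, Delta) - stress(X, W, Delta)) / eps

# Hypothetical example in which rows 0 and 1 of X coincide.
X = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
Y = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
W = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
Delta = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
minus_Y = [[-y for y in row] for row in Y]

for direction in (Y, minus_Y):
    print(formula(X, direction, W, Delta), quotient(X, direction, W, Delta))
```

In this example the formula and the difference quotient agree up to terms of order `eps` in both directions. Both directional derivatives are negative, which illustrates that the derivative is not linear in $Y$ at a configuration with a zero distance.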

Theorem 2: If $\sigma$ has a local minimum at $X$ then $B(X)X=VX$ and $\theta(X,Y)=0$ for all $Y$.

Proof: Suppose $(V-B(X))X\neq 0$. Then we can find $Y$ such that $\mathrm{tr}\ Y'(V-B(X))X<0$,
and because $\theta(X,Y)\geq 0$ we have $D\sigma(X,Y)<0$. Suppose $\theta(X,Y)>0$ for some $Y$. If
$\mathrm{tr}\ Y'(V-B(X))X\leq 0$ we have $D\sigma(X,Y)<0$, and if $\mathrm{tr}\ Y'(V-B(X))X>0$ we replace $Y$
by $-Y$, which changes the sign of the trace but leaves $\theta$ unchanged, and again $D\sigma(X,Y)<0$. QED

Corollary 1: If $\sigma$ has a local minimum at $X$ then $d_{ij}(X)>0$ for all $(i,j)$ with $w_{ij}\delta_{ij}>0$.

Proof: If there is an $(i,j)$ such that $w_{ij}\delta_{ij}\alpha_{ij}(X)>0$ then there is a $Y$ such that $\theta(X,Y)>0$.
QED
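
Corollary 1 gives a simple screening test: a configuration with a zero distance on a pair that has positive weight and positive dissimilarity cannot be a local minimum. The sketch below lists such pairs. The function name and the tolerance parameter `tol` are hypothetical; an exact comparison with zero (`tol=0.0`) matches the statement of the corollary, while a small positive tolerance may be more practical in floating point.

```python
import math

def zero_distance_violations(X, W, Delta, tol=0.0):
    """Pairs (i, j), i < j, with w_ij * delta_ij > 0 and d_ij(X) <= tol."""
    n = len(X)
    bad = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
            if W[i][j] * Delta[i][j] > 0 and d <= tol:
                bad.append((i, j))
    return bad

# Hypothetical configuration in which rows 0 and 1 coincide.
X = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
Delta = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
W_full = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]     # all pairs weighted
W_partial = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]  # pair (0, 1) has zero weight

print(zero_distance_violations(X, W_full, Delta))
print(zero_distance_violations(X, W_partial, Delta))
```

With all pairs weighted the coincident pair is reported, so this configuration is ruled out as a local minimum. With zero weight on that pair nothing is reported, and the corollary says nothing about the configuration.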

Corollary 2: If $\sigma$ has a local minimum at $X$ then $w_{ij}\delta_{ij}=0$ for all $(i,j)$ with $d_{ij}(X)=0$.

Proof: This is just another way of saying that $\theta(X,Y)=0$ for all $Y$. QED

Corollary 3: If $w_{ij}\delta_{ij}>0$ for all $i\neq j$ then $\sigma$ is differentiable at a local minimum.

3 Final Result 

Corollary 3 in the previous section allows for the possibility that $\sigma$ is not differentiable at
local minima if $w_{ij}\delta_{ij}=0$ for some index pairs $(i,j)$. This is important, for example, in
unfolding, where indices $1,2,\cdots,n$ are partitioned into two disjoint subsets, and all within-subset
weights are zero. It turns out that a fairly trivial manipulation allows us to find the
appropriate generalization of our result.

Theorem 3: $\sigma$ is differentiable at local minima.

Proof: An essentially equivalent way to formulate the MDS problem is to minimize

$$\sigma(X) = \sum_{k=1}^K w_k(\delta_k-d_k(X))^2,$$

where $d_k(X)=\sqrt{\mathrm{tr}\ X'A_{ij}X}$ for some pair $1\leq i<j\leq n$. Thus we fit distances to some, but
not necessarily all, of the dissimilarities. In this formulation we can clearly assume without
loss of generality that $w_k>0$ for all $1\leq k\leq K$. Suppose $\mathcal{K}_0$ and $\mathcal{K}_1$ are the subsets of
indices for which, respectively, $\delta_k=0$ and $\delta_k>0$. Then

$$\sigma(X) = \sum_{k\in\mathcal{K}_1} w_k(\delta_k-d_k(X))^2 + \sum_{k\in\mathcal{K}_0} w_kd_k^2(X).$$

Each term $w_kd_k^2(X)$ in the second sum is quadratic in $X$, and thus differentiable everywhere.
By the same reasoning as before, at a local minimum $X$ we have $d_k(X)>0$ for all $k\in\mathcal{K}_1$,
which implies that $\sigma$ is differentiable at that local minimum. QED

Note that $d_k(X)>0$ for all $k\in\mathcal{K}_1$ actually implies more: stress is infinitely many times
differentiable in an open neighborhood of each local minimum.
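
The reformulation used in the proof can be sketched for a small unfolding problem. In the hypothetical example below, indices 0 and 1 form one subset and indices 2 and 3 the other, all within-subset weights are zero, and one between-subset dissimilarity is zero. The functions `pair_terms` and `split_stress` are illustrative names, not part of the paper.

```python
import math

def pair_terms(W, Delta):
    """List (i, j, w_k, delta_k) for the pairs i < j with positive weight."""
    n = len(W)
    return [(i, j, W[i][j], Delta[i][j])
            for i in range(n) for j in range(i + 1, n) if W[i][j] > 0]

def split_stress(X, terms):
    """Return the two parts of stress: the sum over K_1 and the sum over K_0."""
    part1 = part0 = 0.0
    for i, j, w, delta in terms:
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
        if delta > 0:
            part1 += w * (delta - d) ** 2  # k in K_1: delta_k > 0
        else:
            part0 += w * d ** 2            # k in K_0: delta_k = 0
    return part1, part0

# Hypothetical unfolding data: subsets {0, 1} and {2, 3}.
W = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
Delta = [[0, 0, 1, 0], [0, 0, 2, 1], [1, 2, 0, 0], [0, 1, 0, 0]]
X = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # rows 0 and 1 coincide

terms = pair_terms(W, Delta)
part1, part0 = split_stress(X, terms)
print(len(terms), part1, part0, part1 + part0)
```

Only the four between-subset pairs enter the sum. The coincident rows 0 and 1 belong to a pair with zero weight, so that zero distance never appears in the loss, and the single pair with zero dissimilarity contributes only the smooth term $w_kd_k^2(X)$.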